fix: add batch size limits for embedding models to support Aliyun Bailian #7704
Conversation
- Add maxBatchSize property to EmbeddingModelProfile interface
- Add batch size limits for Aliyun Bailian models (qwen3-embedding, text-embedding-v4)
- Update OpenAICompatibleEmbedder to respect model-specific batch limits
- Update service factory to pass batch size limits to embedders and processors
- Add comprehensive tests for batch size limiting functionality

Fixes #7702
Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.
```ts
if (modelMaxBatchSize && modelMaxBatchSize < batchSize) {
	batchSize = modelMaxBatchSize
}
```
I notice there's duplicate logic here between createDirectoryScanner (lines 182-188) and createFileWatcher (lines 216-222) for checking model-specific batch size limits. Could we extract this into a helper method like getEffectiveBatchSize() to avoid the duplication?
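If it helps, a minimal sketch of that extraction (the surrounding factory code isn't shown in this hunk, so the parameter names are assumptions):

```ts
// Hypothetical helper: clamp the default batch size to the
// model-specific limit, when one is configured.
function getEffectiveBatchSize(defaultBatchSize: number, modelMaxBatchSize?: number): number {
	return modelMaxBatchSize && modelMaxBatchSize < defaultBatchSize
		? modelMaxBatchSize
		: defaultBatchSize
}
```

Both createDirectoryScanner and createFileWatcher could then call this instead of repeating the check.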
```ts
// Aliyun Bailian models with batch size limits
"qwen3-embedding": { dimension: 1536, scoreThreshold: 0.4, maxBatchSize: 10 },
"text-embedding-v4": { dimension: 1536, scoreThreshold: 0.4, maxBatchSize: 10 },
```
Are there other Aliyun Bailian models that might need similar batch size limits? Currently we only have qwen3-embedding and text-embedding-v4 configured. It might be worth checking their documentation for other models that could benefit from this.
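For reference, the service factory resolves this limit through getModelMaxBatchSize (the function mocked in the spec updates noted at the end of this page). A rough sketch of such a lookup against the profile map above; the map name and exact shape here are assumptions for illustration:

```ts
interface EmbeddingModelProfile {
	dimension: number
	scoreThreshold?: number
	queryPrefix?: string
	maxBatchSize?: number
}

// Assumed map name; the real file presumably keys profiles by model ID
// as in the hunk above.
const EMBEDDING_MODEL_PROFILES: Record<string, EmbeddingModelProfile> = {
	"qwen3-embedding": { dimension: 1536, scoreThreshold: 0.4, maxBatchSize: 10 },
	"text-embedding-v4": { dimension: 1536, scoreThreshold: 0.4, maxBatchSize: 10 },
}

export function getModelMaxBatchSize(modelId: string): number | undefined {
	return EMBEDDING_MODEL_PROFILES[modelId]?.maxBatchSize
}
```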
```ts
dimension: number
scoreThreshold?: number // Model-specific minimum score threshold for semantic search
queryPrefix?: string // Optional prefix required by the model for queries
maxBatchSize?: number // Maximum number of items that can be sent in a single batch
```
Consider adding more detailed documentation here to explain when and why this limit is needed. For example: 'Maximum number of items that can be sent in a single batch. Some providers (e.g., Aliyun Bailian) impose strict batch size limits on their embedding APIs.'
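Applied to the field above, the suggested wording might look like this (a sketch, not the merged code):

```ts
/**
 * Maximum number of items that can be sent in a single batch.
 * Some providers (e.g., Aliyun Bailian) impose strict batch size
 * limits on their embedding APIs.
 */
maxBatchSize?: number
```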
```ts
expect(mockEmbeddingsCreate.mock.calls[0][0].input).toHaveLength(10)
expect(result.embeddings).toHaveLength(10)
})
})
```
Nice comprehensive test coverage! Consider adding one more edge case test: what happens when a single text item exceeds both the token limit AND we have a batch size limit? This would ensure the warning is still logged correctly and the item is skipped as expected.
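Such a test might look roughly like this; MAX_ITEM_TOKENS, consoleWarnSpy, createEmbeddings, and the embedder setup are assumptions about the spec file, while mockEmbeddingsCreate comes from the assertions above:

```ts
// Sketch of the suggested edge case, not the actual spec file.
it("skips an oversized item while still honoring the batch size limit", async () => {
	const oversized = "a".repeat(MAX_ITEM_TOKENS * 10) // well past the per-item token limit
	const texts = [oversized, ...Array.from({ length: 15 }, (_, i) => `text ${i}`)]

	const result = await embedder.createEmbeddings(texts, "qwen3-embedding")

	// The oversized item should be skipped with a warning...
	expect(consoleWarnSpy).toHaveBeenCalled()
	// ...and the remaining items should still be chunked into batches of at most 10.
	for (const call of mockEmbeddingsCreate.mock.calls) {
		expect(call[0].input.length).toBeLessThanOrEqual(10)
	}
	expect(result.embeddings).toHaveLength(15)
})
```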
Closing, see #7702 (comment)
Summary
This PR addresses Issue #7702 by implementing configurable batch size limits for embedding models, specifically to support Aliyun Bailian's Qwen3-Embedding models, which have a maximum batch size of 10 items.
Problem
The Qwen3-Embedding and text-embedding-v4 models from Aliyun Bailian have a strict batch size limit of 10 items per request. When indexing codebases with more than 10 chunks, the embedding API returns an error.
Solution
- Added batch size configuration to embedding model profiles: a `maxBatchSize` property on the `EmbeddingModelProfile` interface
- Updated `OpenAICompatibleEmbedder` to respect batch limits, as sketched below
- Updated the service factory to propagate batch limits
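The net effect is that the embedder chunks its input by the smaller of the default batch size and the model's limit. A minimal sketch of that idea, with illustrative names rather than the PR's actual code:

```ts
// Illustrative only: split texts into batches no larger than the
// model's maxBatchSize (when set), then embed each batch in turn.
async function embedInBatches(
	texts: string[],
	defaultBatchSize: number,
	modelMaxBatchSize: number | undefined,
	embedBatch: (batch: string[]) => Promise<number[][]>,
): Promise<number[][]> {
	const batchSize =
		modelMaxBatchSize && modelMaxBatchSize < defaultBatchSize
			? modelMaxBatchSize
			: defaultBatchSize

	const embeddings: number[][] = []
	for (let i = 0; i < texts.length; i += batchSize) {
		embeddings.push(...(await embedBatch(texts.slice(i, i + batchSize))))
	}
	return embeddings
}
```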
Testing
Impact
Fixes #7702
Important
Adds batch size limits for embedding models to support Aliyun Bailian, updating embedders and service factory to respect these limits.
- Adds `maxBatchSize` to `EmbeddingModelProfile` in `embeddingModels.ts` for model-specific batch size limits.
- Updates `OpenAICompatibleEmbedder` in `openai-compatible.ts` to respect `maxBatchSize`.
- Updates `service-factory.ts` to propagate batch size limits to embedders, scanners, and file-watchers.
- Adds `openai-compatible-batch-limit.spec.ts` to test batch size limiting functionality.
- Updates `service-factory.spec.ts` to mock `getModelMaxBatchSize` and test embedder creation with batch size limits.